Abstract

The Matrix Bandwidth Minimization Problem (MBMP) seeks a simultaneous
reordering of the rows and the columns of a square matrix such that the
nonzero entries are collected within a band of small width close to the
main diagonal. The MBMP is an NP-complete problem, with applications in
many scientific domains (linear systems, artificial intelligence) and in
real-life situations in industry, logistics and information recovery.
Complex problems are hard to solve, which is why any attempt to improve
their solutions is beneficial. Genetic algorithms and ant-based systems
are the Soft Computing methods used in this paper to solve some MBMP
instances. Our approach is based on a learning agent-based model
involving a local search procedure. The algorithm is compared with the
classical Cuthill-McKee algorithm and with a hybrid genetic algorithm,
using several instances from the Matrix Market collection. Computational
experiments confirm a good performance of the proposed algorithms for
the considered set of MBMP instances. On a Soft Computing basis, we also
propose a new theoretical Reinforcement Learning model for solving the
MBMP.

1. Introduction

Combinatorial optimization is the search for one or more optimal
solutions in a well-defined discrete problem space. In real-life
settings, this means finding efficient allocations of limited resources
for achieving desired goals when all the variables have integer values.
As workers, planes or boats are indivisible (like many other resources),
Combinatorial Optimization Problems (COPs) receive intense attention
from the scientific community today.

Current real-life COPs are difficult in many ways: the solution space is
huge, the parameters are linked, the decomposability is not obvious, the
restrictions are hard to test, the locally optimal solutions are many
and hard to locate, and the uncertainty and dynamism of the environment
must be taken into account. All these characteristics, and others, make
algorithm design and implementation challenging tasks. The quest for
ever more efficient solving methods is permanently driven by the growing
complexity of our world.

The Matrix Bandwidth Minimization Problem (MBMP) is a fundamental
mathematical problem that searches for a simultaneous permutation of the
rows and the columns of a square matrix that keeps its nonzero entries
as close as possible to the main diagonal. This problem is NP-complete
in general [18], and it remains so even in restricted solution spaces
[8], which is why any attempt to improve its solutions is beneficial.

The main contribution of this paper is to emphasize 
the effectiveness of using soft computing methods in order 
to solve the Matrix Bandwidth Minimization Problem. Ge- 
netic algorithms and ant-based systems are natural com- 
puting methods used in this paper in order to solve the 
MBMP instances. Computational experiments confirm 
that these methods provide robust and low-cost solutions 
for the MBMP. We also introduce a new theoretical rein- 
forcement learning model for solving the MBMP. So far, 
such a learning model has not been reported in the MBMP 
literature. 

The rest of the paper is organized as follows. Section 2 briefly
presents the matrix bandwidth minimization problem, emphasizing its
relevance and presenting existing approaches for solving it. The
fundamentals of the soft computing approaches considered in this paper,
i.e. genetic algorithms, ant colony systems and reinforcement learning,
are given in Section 3. In Section 4 we propose two natural computing
methods for solving MBMP instances, namely genetic algorithms and ant
colony systems. A theoretical reinforcement learning model for solving
the MBMP is introduced in Section 5. Section 6 provides an experimental
evaluation of the proposed methods, and Section 7 contains some
conclusions and directions for future work.



2. The Matrix Bandwidth Minimization Problem 

This section introduces the Matrix Bandwidth Minimization Problem and
reviews the related literature.

2.1. Matrix Bandwidth Minimization Problem description 

Given a square positive symmetric matrix $A = (a_{ij})_{1 \le i,j \le n}$,
the bandwidth $\beta$ is the value
$\beta(A) = \max_{a_{ij} \neq 0} |i - j|$. The Matrix Bandwidth
Minimization Problem searches for a row (and column) permutation $\pi$
that minimizes the bandwidth of the permuted matrix.
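
The definition above can be illustrated with a minimal Python sketch. The function names `bandwidth` and `permute`, the 0-based indexing and the small example matrix are assumptions made for illustration only; they are not part of the paper.

```python
from typing import Sequence


def bandwidth(a: Sequence[Sequence[float]]) -> int:
    """Return max |i - j| over the nonzero entries a[i][j] (0 if there are none)."""
    return max(
        (abs(i - j) for i, row in enumerate(a) for j, v in enumerate(row) if v != 0),
        default=0,
    )


def permute(a: Sequence[Sequence[float]], perm: Sequence[int]) -> list:
    """Reorder rows and columns simultaneously: result[i][j] = a[perm[i]][perm[j]]."""
    return [[a[p][q] for q in perm] for p in perm]


# Hypothetical 4x4 symmetric matrix whose only off-diagonal entries are (0,3) and (3,0).
A = [
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
]

print(bandwidth(A))                          # original ordering
print(bandwidth(permute(A, [0, 3, 1, 2])))   # rows/columns 0 and 3 placed next to each other
```

Placing the two coupled indices next to each other moves the off-diagonal entries onto the first off-diagonal, which is exactly the effect the MBMP looks for.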

An equivalent form of the MBMP uses the graph-theoretic approach, based
on the notion of layout. Given an undirected, connected graph
$G = (V, E)$, a layout $\sigma$ is a bijection between $V$ and
$\{1, 2, \ldots, |V|\}$. The bandwidth of $G$ is
$\beta(G) = \min_{\sigma} \left( \max_{(u,v) \in E} |\sigma(u) - \sigma(v)| \right)$.
Intuitively, computing the bandwidth of a graph means finding a linear
ordering of its vertices that minimizes the maximum distance between two
adjacent vertices.

Starting from the given matrix $A$, an equivalent graph $G_A = (V, E)$
can be defined, and the MBMP can be viewed as the problem of minimizing
the bandwidth of $G_A$. In this graph, the set of vertices is
$V = \{1, 2, \ldots, n\}$ and two vertices $i$ and $j$ are connected
through an edge iff $a_{ij} \neq 0$, i.e.
$E = \{(i, j) : a_{ij} \neq 0\}$.
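
A short sketch of this matrix-to-graph correspondence follows. The helper names `matrix_to_graph` and `layout_bandwidth`, the 0-based vertex labels and the example matrix are illustrative assumptions; diagonal entries are skipped because a self-loop contributes distance 0.

```python
def matrix_to_graph(a):
    """Adjacency sets of G_A: i and j are adjacent iff a[i][j] != 0 (i != j)."""
    n = len(a)
    return {i: {j for j in range(n) if j != i and a[i][j] != 0} for i in range(n)}


def layout_bandwidth(adj, layout):
    """Maximum |layout[u] - layout[v]| over all edges (u, v) for one fixed layout."""
    return max((abs(layout[u] - layout[v]) for u in adj for v in adj[u]), default=0)


# Hypothetical matrix of the path graph 0 - 1 - 2.
A = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
G = matrix_to_graph(A)

identity = {0: 0, 1: 1, 2: 2}   # vertices kept in their natural order
shuffled = {0: 0, 1: 2, 2: 1}   # middle vertex moved to the end

print(layout_bandwidth(G, identity))
print(layout_bandwidth(G, shuffled))
```

The bandwidth of the graph is the minimum of `layout_bandwidth` over all layouts; the sketch only evaluates a given layout and does not perform that minimization.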

The current exact approaches devise algorithms that solve the general
MBMP in $O(4.83^n)$ running time [?]. Classic results for approximation
approaches establish polylogarithmic approximation factors, both for the
general MBMP [?] and for trees [10].

The MBMP arose from solving systems of linear equations: the ordering of
the system matrix has a great impact on the resources needed to actually
solve the system, and a good ordering may lead to a substantial
efficiency increase. Minimizing the bandwidth of a matrix improves the
efficiency of certain linear algebra algorithms, such as Gaussian
elimination.

Current applications of the MBMP in computer science include VLSI
design, network survivability and data storage. Other applications are
found in the electromagnetic industry [6], large-scale power
transmission systems, chemical kinetics and numerical geophysics [?],
and information retrieval in hypertext [?].

Some generalizations of the MBMP are currently under investigation. For
example, the two-dimensional bandwidth problem is to embed a graph into
a planar grid such that the maximum distance between adjacent vertices
is as small as possible [14].



2.2. Literature review. 

The importance of the bandwidth minimization problem is also reflected
by the large number of publications describing algorithms for solving
it. In 1969, Cuthill and McKee proposed in [2] the first stable
heuristic method for the MBMP: the CM algorithm, based on Breadth-First
Search.
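
For orientation, the breadth-first idea behind the CM heuristic can be sketched as below. This is a textbook-style simplification under stated assumptions (start from a minimum-degree vertex, visit unseen neighbours in increasing degree order, break ties by label); it is not claimed to reproduce the exact procedure of [2] or the implementation used later in this paper.

```python
from collections import deque


def cuthill_mckee(adj):
    """Return a vertex ordering obtained by degree-sorted breadth-first search.

    adj maps each vertex to the set of its neighbours.
    """
    degree = {v: len(ns) for v, ns in adj.items()}
    order, seen = [], set()
    # Restart from the lowest-degree unseen vertex so disconnected graphs are covered.
    for start in sorted(adj, key=lambda v: (degree[v], v)):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            # Enqueue unseen neighbours, lowest degree first.
            for w in sorted(adj[v] - seen, key=lambda u: (degree[u], u)):
                seen.add(w)
                queue.append(w)
    return order


# Hypothetical path graph 0 - 2 - 1 - 3 whose labels are out of order.
adj = {0: {2}, 1: {2, 3}, 2: {0, 1}, 3: {1}}
print(cuthill_mckee(adj))
```

On this example the returned ordering walks along the path, so every edge joins consecutive positions.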



Marti et al. used Tabu Search in [15] for solving the MBMP. They used a
candidate list strategy to accelerate the selection of moves in the
neighborhood of the current solution.

A GRASP with Path Relinking method given by Pinana et al. in [19] has
been shown to achieve better results than the Tabu Search procedure, but
with longer running times. Lim et al. propose in [11] a Genetic
Algorithm integrated with Hill Climbing to solve the bandwidth
minimization problem.



A simulated annealing algorithm for the matrix bandwidth minimization
problem is presented in [21]. The algorithm, proposed by Tello et al.,
is based on three distinguishing features: an original internal
representation of solutions, a highly discriminating evaluation function
and an effective neighborhood. More recently, the Ant Colony
Optimization metaheuristic has been used in [13], [20] to solve the
MBMP.

3. Background 

In this section we briefly review the fundamentals of the soft computing
approaches used in this paper for solving the MBMP, i.e. genetic
algorithms, ant colony optimization and reinforcement learning.

Soft Computing is the collection of computing branches that cope with
imprecision, uncertainty, partial truth and approximation, as manifested
in nature and naturally (and gracefully) handled by biological entities
(cells, organisms, or collections of individuals). The goal of soft
computing approaches is to achieve tractability, robustness and low-cost
solutions when facing real-life, complex, high-dimensional problems.

Genetic algorithms (GAs), invented by John Holland in the 1960s, are the
most widely used approaches to computational evolution. Genetic
algorithms provide an approach to machine learning [?], a method
motivated by analogy to biological evolution. Hypotheses are often
described by bit strings whose interpretation depends on the
application, though hypotheses may also be described by symbolic
expressions or even computer programs [?].

Ant Colony Optimization (ACO) studies artificial systems that are
inspired by the behavior of real ant colonies and are used to solve
COPs. The ACO methods use a set of cooperative artificial ants, each
constructing a solution based on the expected quality of the available
moves and on the good solutions found by the community. ACO has
demonstrated high flexibility and strength by solving, with very good
results, both academic instances of many COPs and real-life problems. To
improve efficiency, the ant-based algorithms are designed using
problem-specific information and involve local search methods.

The goal of building systems that can adapt to their environments and
learn from their experiences has attracted researchers from many fields,
including computer science, mathematics and cognitive sciences [23].
Reinforcement learning (RL) is learning what to do (how to map
situations to actions) so as to maximize a numerical reward signal. The
learner is not told which actions to take, as in most forms of machine
learning, but instead must discover which actions yield the highest
reward by trying them. In RL, the computer is simply given a goal to
achieve. The computer then learns how to achieve that goal by
trial-and-error interactions with its environment.

4. Natural computing models for the MBMP 

In this section we propose two natural computing methods for solving
MBMP instances: genetic algorithms and ant colony systems. The
computational experiments from Section 6 confirm that these methods
provide robust and low-cost solutions for the MBMP.

4.1. Genetic Algorithm. 

In the following, a hybrid genetic algorithm (HGA) is proposed for
solving the MBMP. The algorithm proposed in this section is a slight
modification of the Genetic Algorithm integrated with Hill Climbing
proposed by Lim et al. in [11].

Let $A = (a_{ij})_{1 \le i,j \le n}$ be the square symmetric matrix
whose bandwidth $\beta$ has to be minimized.

In the HGA we use, a chromosome is an $n$-dimensional sequence
$\pi_1, \pi_2, \ldots, \pi_n$ representing a permutation $\pi$ of
$\{1, 2, \ldots, n\}$. Thus, a matrix $A_\pi$ can be associated to a
chromosome $\pi$, i.e. the matrix obtained from $A$ by permuting its
rows (and columns) in the order given by the permutation $\pi$. The
fitness function associated to a chromosome $\pi$ is defined as the
bandwidth of the corresponding matrix, i.e.
$fitness(\pi) = \beta(A_\pi)$. The problem consists of minimizing the
fitness function, i.e. finding the individual with the minimum
associated fitness value.

We have used the traditional structure of a genetic algorithm, adding a
Hill Climbing step in order to quickly tune solutions towards a local
optimum [11]. The HGA operates as follows:

i. At the beginning, an initial group of $n$ chromosomes is
constructed, as detailed below.

ii. Then, middle-point crossover and a $k$-swap mutation are performed
on this group of chromosomes to generate new chromosomes [11]. Hill
Climbing is now applied to each newly generated chromosome, as
proposed in [11]. As the number of individuals within a population
remains $n$, the fittest chromosomes will remain in the next
generation. After the new generation is formed, a swap mutation is
applied on all chromosomes within the new generation, except the
best one. Then, Hill Climbing is applied again to each newly
generated chromosome.

iii. Step ii. is repeated for a given number of generations; the
algorithm then stops and the best result is reported as the solution.

The initial population for the HGA is constructed as follows. Starting
from matrix $A$, we construct the corresponding graph $G_A$. Then, the
initial chromosomes are built by performing BFS on the graph, starting
from each node. This way, $n$ initial individuals are constructed.
Applying Hill Climbing [11] on the obtained individuals, the initial
population for the HGA is obtained. The construction of the initial
population for the HGA is slightly different from the method from [11].

As further work we will investigate the appropriateness of replacing the
Hill Climbing step from the HGA with other local search mechanisms, such
as the PSwap or MPSwap procedures that will be described in
Subsection 4.2.
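
The genetic operators named in the HGA description can be sketched as follows. The operator details are assumptions for illustration: the crossover keeps the first half of one parent and fills the rest in the order of the other, the mutation performs `k` random position swaps, and the hill climbing is a generic pairwise-swap descent rather than the specific procedure of [11]. All function names are hypothetical.

```python
import random


def bandwidth_of(adj, perm):
    """Bandwidth of the ordering perm (perm[i] is the vertex placed at position i)."""
    pos = {v: i for i, v in enumerate(perm)}
    return max((abs(pos[u] - pos[v]) for u in adj for v in adj[u]), default=0)


def midpoint_crossover(p1, p2):
    """Child = first half of p1, then the missing genes in the order they occur in p2."""
    head = list(p1[: len(p1) // 2])
    used = set(head)
    return head + [g for g in p2 if g not in used]


def k_swap_mutation(perm, k, rng):
    """Apply k random swaps of two distinct positions."""
    child = list(perm)
    for _ in range(k):
        i, j = rng.sample(range(len(child)), 2)
        child[i], child[j] = child[j], child[i]
    return child


def hill_climb(adj, perm):
    """Keep any pairwise swap that strictly lowers the bandwidth until none does."""
    best = list(perm)
    best_bw = bandwidth_of(adj, best)
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            for j in range(i + 1, len(best)):
                best[i], best[j] = best[j], best[i]
                bw = bandwidth_of(adj, best)
                if bw < best_bw:
                    best_bw, improved = bw, True
                else:
                    best[i], best[j] = best[j], best[i]  # undo a non-improving swap
    return best


# Hypothetical path graph 0 - 2 - 1 - 3 and two parent orderings.
adj = {0: {2}, 1: {2, 3}, 2: {0, 1}, 3: {1}}
rng = random.Random(0)
child = midpoint_crossover([0, 1, 2, 3], [3, 2, 1, 0])
mutant = k_swap_mutation(child, k=1, rng=rng)
tuned = hill_climb(adj, mutant)
print(child, mutant, tuned, bandwidth_of(adj, tuned))
```

Because both operators only rearrange genes, every child remains a valid permutation, which is the property the chromosome encoding relies on.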



4.2. Ant-based system. 

A hybridized ACO approach using a local search procedure is proposed in
this section for solving the MBMP. This local search method is designed
to reduce the bandwidth of the current solution and is executed during
the local search stage of the ACO framework.

In [13], Ant Colony System (ACS) [1] is hybridized with a local search
mechanism. The ACS model is based on the level structure used by the
Cuthill-McKee algorithm [2]. The local search procedure aims at
improving the ACS solutions by reducing the maximal bandwidth. The
integration of a local search phase within the proposed ACS approach to
the MBMP facilitates the refinement of the ants' solutions.

The main stages of the proposed hybrid ACS are as follows.

i. First, the current matrix bandwidth is computed, the pheromone
trails are initialized and the parameter values are established.

ii. The construction stage consists of executing the next steps for a
given number of iterations. At first, all the ants are placed in the
node from the first level, then the local search mechanism is
applied.

Each ant builds a feasible solution by repeatedly making
pseudo-random choices from the available neighbors. While
constructing its solution, an ant also modifies the amount of
pheromone on the visited edges by applying the local updating rule.

After each partial solution is built, the local search mechanism is
applied in order to improve each ant's solution. Finally, once all
ants have finished their tour, the amount of pheromone on the edges
is modified again by applying the global updating rule.

iii. The best current solution is listed.

As illustrated above, the local search procedure is used twice within
the proposed hybrid model: at the beginning of each iteration and after
each partial solution is built, in order to improve each ant's solution.
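
The paper does not spell out the pseudo-random choice or the two updating rules, so the sketch below uses the formulation commonly given for Ant Colony System as an assumption. The heuristic values `eta`, the dictionary-based pheromone table `tau`, the reward `1 / best_cost` and all function names are illustrative choices; `best_cost` is assumed to be positive.

```python
import random


def choose_next(current, candidates, tau, eta, q0=0.95, beta=2.0, rng=random):
    """Pseudo-random proportional choice of the next node among candidates."""
    scores = {c: tau[(current, c)] * eta[(current, c)] ** beta for c in candidates}
    if rng.random() < q0:
        # Exploitation: take the best-scoring candidate.
        return max(scores, key=scores.get)
    # Exploration: sample a candidate with probability proportional to its score.
    threshold = rng.random() * sum(scores.values())
    acc = 0.0
    for c, s in scores.items():
        acc += s
        if threshold <= acc:
            return c
    return c  # guard against floating-point round-off


def local_update(tau, edge, rho=0.001, tau0=0.1):
    """Local rule: pull the pheromone on a just-visited edge towards tau0."""
    tau[edge] = (1 - rho) * tau[edge] + rho * tau0


def global_update(tau, best_edges, best_cost, rho=0.001):
    """Global rule: reinforce the edges of the best solution found so far."""
    for edge in best_edges:
        tau[edge] = (1 - rho) * tau[edge] + rho * (1.0 / best_cost)


# Tiny hypothetical example with two candidate moves from node "a".
tau = {("a", "b"): 1.0, ("a", "c"): 2.0}
eta = {("a", "b"): 1.0, ("a", "c"): 1.0}
print(choose_next("a", ["b", "c"], tau, eta, q0=1.0))
```

With `q0` close to 1 the ants mostly exploit the current pheromone and heuristic information, while the local rule slowly erodes the pheromone on visited edges so that later ants are nudged towards other choices.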

In [20], two local search mechanisms are introduced: PSwap and MPSwap.
The corresponding hybrid algorithms are denoted by hACS and hMACS,
respectively.

PSwap first finds the maximum and minimum degrees. Then, for all indexes
$x$ with the maximum degree, it randomly selects an unvisited node $y$
with a minimum degree and then swaps the nodes $x$ and $y$.

hMACS was introduced in order to avoid stagnation. First, the maximum
and minimum degrees are found. For all indexes $x$ with the maximum
degree, it randomly selects an unvisited node $y$ with a minimum degree
such that the matrix bandwidth decreases, and then swaps the nodes $x$
and $y$.

The experimental results reported in [?] show that the MPSwap procedure
performs better on small instances, while PSwap is better on larger
ones.
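
One possible reading of the two swap procedures is sketched below. Several points are assumptions, since the description leaves them open: "swapping nodes" is taken to mean exchanging their positions in the current ordering, a node counts as visited once it has been chosen as $y$, and the MPSwap variant restricts the random choice to those $y$ whose swap strictly lowers the bandwidth. Function names are hypothetical.

```python
import random


def bandwidth_of(adj, perm):
    """Bandwidth of the ordering perm (perm[i] is the vertex placed at position i)."""
    pos = {v: i for i, v in enumerate(perm)}
    return max((abs(pos[u] - pos[v]) for u in adj for v in adj[u]), default=0)


def swapped(perm, x, y):
    """Copy of perm with the positions of vertices x and y exchanged."""
    out = list(perm)
    i, j = out.index(x), out.index(y)
    out[i], out[j] = out[j], out[i]
    return out


def pswap(adj, perm, rng, improving_only=False):
    """PSwap sketch; improving_only=True gives the MPSwap-style variant."""
    degree = {v: len(ns) for v, ns in adj.items()}
    d_max, d_min = max(degree.values()), min(degree.values())
    current, visited = list(perm), set()
    for x in sorted(v for v in adj if degree[v] == d_max):
        pool = [y for y in sorted(adj)
                if degree[y] == d_min and y not in visited and y != x]
        if improving_only:
            base = bandwidth_of(adj, current)
            pool = [y for y in pool
                    if bandwidth_of(adj, swapped(current, x, y)) < base]
        if not pool:
            continue
        y = rng.choice(pool)
        visited.add(y)
        current = swapped(current, x, y)
    return current


# Hypothetical star graph: centre 0 joined to leaves 1..4, centre placed first.
adj = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
start = [0, 1, 2, 3, 4]
plain = pswap(adj, start, random.Random(0))
strict = pswap(adj, start, random.Random(0), improving_only=True)
print(bandwidth_of(adj, start), bandwidth_of(adj, plain), bandwidth_of(adj, strict))
```

In the star example the improving-only variant always moves the centre away from the end of the ordering, whereas the plain variant may also pick a swap that leaves the bandwidth unchanged.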

5. A theoretical reinforcement learning model for 
solving MBMP 

In this section we investigate a reinforcement learning approach for
solving the MBMP and introduce our RL model.

Let us assume, in the following, that $A$ is the symmetric matrix of
order $n$ whose bandwidth has to be minimized.

5.1. Problem definition. 

We define the RL problem associated to MBMP as in [?]:

• The environment $E$ consists of the set of states
$\{1, 2, \ldots, n\}$ extended with a state $s_0$ that is connected
to all other states, i.e. $E = \{1, 2, \ldots, n\} \cup \{s_0\}$.

• The initial state $s_i$ of the agent in the environment is $s_0$.

• A state $s_f \in E$ reached by the agent at a given moment after it
has visited states $s_i, s_1, s_2, \ldots, s_k$ is a terminal (final)
state if the number of states visited by the agent in the current
sequence is $n + 1$, i.e. $k = n$.

• The transition function between the states is defined as
$h : E \to P(E)$, where $h(i) = \{1, 2, \ldots, n\}$. This means
that, at a given moment, from the state $i$ the agent can move to any
state from $E$, except the initial state. We say that a state $j$
that is accessible from state $i$ ($j \in h(i)$) is the neighbor
(successor) state of $i$.

• The transitions between the states are equiprobable: the transition
probability $P(i, j)$ between a state $i$ and each neighbor state $j$
of $i$ is $P(i, j) = \frac{1}{n}$.



The RL problem consists in training the MBMP agent to find a path
$s_i, \pi_1, \pi_2, \ldots, \pi_n$ from the initial to a final state,
i.e. a permutation $\pi$ of $\{1, 2, \ldots, n\}$ that minimizes the
corresponding matrix bandwidth. Let us consider that, at a given moment,
the agent has visited states $s_i, \pi_1, \pi_2, \ldots, \pi_k$, where
$k \le n$, $\pi_i \in E$, $\pi_i \neq s_i$, $\forall\, 1 \le i \le k$,
and $\pi_i \neq \pi_j$, $\forall\, 1 \le i, j \le k$, $i \neq j$.
Starting from the path $\pi_1, \pi_2, \ldots, \pi_k$, we construct a
permutation of $\{1, 2, \ldots, n\}$, denoted by
$\sigma^{\pi(1..k)} = (\sigma_1^{\pi(1..k)}, \sigma_2^{\pi(1..k)}, \ldots, \sigma_n^{\pi(1..k)})$.
An element $\sigma_j^{\pi(1..k)}$ ($\forall\, 1 \le j \le n$) is
computed as follows:

- If $j \le k$, then $\sigma_j^{\pi(1..k)} = \pi_j$.

- If $k < j \le n$ and $j \notin \{\pi_1, \pi_2, \ldots, \pi_k\}$, then
$\sigma_j^{\pi(1..k)} = j$.

- If $k < j \le n$ and $j \in \{\pi_1, \pi_2, \ldots, \pi_k\}$, then
$\sigma_j^{\pi(1..k)} = s$, where $1 \le s \le n$,
$s \notin \{\pi_1, \pi_2, \ldots, \pi_k\}$ and $\exists\, m$,
$1 \le m \le k$ and $i_1, i_2, \ldots, i_m$
($1 \le i_q \le k$, $\forall\, 1 \le q \le m$) such that
$j = \pi_{i_1}$, $i_1 = \pi_{i_2}$, ..., $i_{m-1} = \pi_{i_m}$ and
$i_m = s$.

Based on the definition of $\sigma^{\pi(1..k)}$ given above, it can be
proved that $\sigma^{\pi(1..k)}$ is a permutation of
$\{1, 2, \ldots, n\}$. Now, a matrix $A^{\sigma^{\pi(1..k)}}$ can be
obtained from the initial matrix $A$ by permuting its rows (and columns)
in the order given by the permutation $\sigma^{\pi(1..k)}$.

Consequently, a path $\pi_1, \pi_2, \ldots, \pi_k$ of the agent in the
environment corresponds to the matrix $A^{\sigma^{\pi(1..k)}}$ obtained
as we have described above.
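
The three rules can be read as a chain-following completion of the visited prefix, and the sketch below implements that reading. It is an interpretation under stated assumptions: in the third rule, position $j$ receives the first value reached by repeatedly replacing a value with its position in the prefix until a value outside the prefix is found. The function name `complete_permutation` is hypothetical; values are 1-based, as in the text.

```python
def complete_permutation(prefix, n):
    """Extend the visited states prefix = (pi_1, ..., pi_k) to a permutation of 1..n."""
    k = len(prefix)
    # position_of[v] is the (1-based) index at which value v occurs in the prefix.
    position_of = {v: i for i, v in enumerate(prefix, start=1)}
    sigma = list(prefix)                 # rule 1: sigma_j = pi_j for j <= k
    for j in range(k + 1, n + 1):
        s = j                            # rule 2: j itself if it is still unused
        while s in position_of:          # rule 3: follow j = pi_{i1}, i1 = pi_{i2}, ...
            s = position_of[s]
        sigma.append(s)
    return sigma


print(complete_permutation([3, 1], 4))
print(complete_permutation([2, 3], 3))
print(complete_permutation([], 3))
```

The loop terminates because each followed index is at most $k$ while the starting position $j$ is larger than $k$, so the chain can never return to its start; the result therefore uses every value of $\{1, 2, \ldots, n\}$ exactly once.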

5.2. Reinforcement function. 

As we aim at obtaining a permutation $\pi$ of $\{1, 2, \ldots, n\}$ that
minimizes the matrix bandwidth, we define the reinforcement function as
indicated in Equation (1). We mention that an alternative method to
define the reinforcement function was considered in [?].

• the reward received in state $\pi_k$ after states
$s_i, \pi_1, \pi_2, \ldots, \pi_{k-1}$ were visited, denoted by
$r(\pi_k | s_i, \pi_1, \ldots, \pi_{k-1})$, is computed as the
bandwidth of matrix $A^{\sigma^{\pi(1..k-1)}}$ minus the bandwidth of
matrix $A^{\sigma^{\pi(1..k)}}$.

$$
r(\pi_k | s_i, \pi_1, \ldots, \pi_{k-1}) =
\begin{cases}
? & \text{if } k = 1 \\
\beta(A^{\sigma^{\pi(1..k-1)}}) - \beta(A^{\sigma^{\pi(1..k)}}) & \text{otherwise}
\end{cases}
\qquad (1)
$$

Considering the reward defined in Equation (1), as the learning goal is
to maximize the total amount of rewards received on a path from the
initial to a final state, it can be easily proved that the agent is
trained to find a permutation $\pi$ of $\{1, 2, \ldots, n\}$ that
minimizes the bandwidth of the corresponding matrix
$A^{\sigma^{\pi(1..n)}}$.
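
A short worked step makes this claim plausible. Assuming that, for $k > 1$, the reward in Equation (1) is the bandwidth before the move minus the bandwidth after it, and writing $\beta_k$ (a shorthand introduced here) for the bandwidth of the matrix associated with the first $k$ visited states, the rewards along a complete path telescope:

```latex
\sum_{k=2}^{n} r(\pi_k \mid s_i, \pi_1, \ldots, \pi_{k-1})
  = \sum_{k=2}^{n} \left( \beta_{k-1} - \beta_k \right)
  = \beta_1 - \beta_n
```

Apart from the contribution of the first step, the total reward therefore grows exactly as the final bandwidth $\beta_n$ shrinks.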
5.3. The learning process.

During the training step of the learning process, the agent will
determine its optimal policy in the environment, i.e. the policy that
maximizes the sum of the received rewards. During the training process,
the estimations of the states' utilities converge to their exact values;
thus, at the end of the training process, the estimations will be in the
vicinity of the exact values.

It has been proved that RL algorithms such as SARSA [22] converge with
probability 1 to a utility function as long as all state-action pairs
are visited an infinite number of times and the policy converges in the
limit to the Greedy policy.

Consequently, after the training step of the agent has been completed,
the solution learned by the agent is constructed by starting from the
initial state and following the Greedy policy until a solution is
reached. From a given state $i$, using the Greedy policy, the agent
moves to an unvisited neighbor $j$ of $i$ having the maximum utility
value. The solution of the MBMP reported by the RL agent is a
permutation $\pi$ of $\{1, 2, \ldots, n\}$ such that
$U(\pi_1) \ge U(\pi_2) \ge \cdots \ge U(\pi_n)$, $U$ being the utility
function learned by the agent during its training. Considering the
general goal of an RL agent, it can be proved that the permutation $\pi$
of $\{1, 2, \ldots, n\}$ learned by the MBMP agent converges to the
permutation that corresponds to the matrix with the minimum bandwidth.
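
The greedy read-out of a solution from learned utilities can be sketched as follows. The utility values are made up for illustration, the function name `greedy_permutation` is hypothetical, and breaking ties in favour of the smaller state index is an assumption not stated in the text.

```python
def greedy_permutation(utility):
    """Visit the states in order of decreasing learned utility.

    utility maps each state in {1, ..., n} to its estimated utility U.
    """
    unvisited = set(utility)
    path = []
    while unvisited:
        # Greedy policy: move to the unvisited state with the maximum utility
        # (ties go to the smaller state index).
        nxt = max(unvisited, key=lambda s: (utility[s], -s))
        path.append(nxt)
        unvisited.remove(nxt)
    return path


# Hypothetical utilities for n = 4 states.
U = {1: 0.2, 2: 0.9, 3: 0.5, 4: 0.5}
print(greedy_permutation(U))
```

Since every state is reachable from every other state in this model, the greedy walk is simply a sort of the states by utility, which matches the ordering condition on $U$ given above.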



6. Computational experiments 

This section presents a comparative evaluation of the techniques
proposed in Section 4 for solving the MBMP. The results are compared
with those reported by the CM algorithm [2].

Nine benchmark instances from the National Institute of Standards and
Technology, Matrix Market, Harwell-Boeing sparse matrix collection [17]
were used in the computational experiments. Table 1 lists, for each
considered instance, the following characteristics: number of rows,
number of columns and number of nonzero entries.



Table 1: The benchmark instances.

No.   Instance   Rows   Columns   Nonzero entries
1     can_24       24        24                92
2     can_61       61        61               309
3     can_62       62        62               140
4     can_73       73        73               225
5     can_96       96        96               432
6     can_187     187       187               839
7     can_229     229       229              1003
8     can_256     256       256              1586
9     can_268     268       268              1675


The hybrid genetic algorithm HGA (Subsection 4.1) and the hybrid ant
systems hACS and hMACS (Subsection 4.2) were implemented and applied to
the instances described in Table 1. Some details regarding the
implementations of HGA, hACS and hMACS follow.

The Hybrid GA is based on a Delphi implementation and is tested with a
10% mutation rate, $k = [n/10]$ and 50, respectively 100 generations.
GA1 denotes the hybrid genetic algorithm with 50 generations and GA2 the
hybrid genetic algorithm with 100 generations.

The hybrid ant algorithms hACS and hMACS are implemented in Java. For
each instance, both algorithms are executed 20 times.

The parameter values for both implementations are: 10 ants, 10
iterations, $q_0 = 0.95$, $\beta = 2$, $\rho = 0.001$, $\tau_0 = 0.1$.
The algorithms were compiled on an AMD 2600 computer with 1024 MB memory
and 1.9 GHz CPU clock.

Table 2 compares the best solution (the bandwidth of the matrix)
obtained by the CM, hACS, hMACS, GA1 and GA2 algorithms for the
instances given in Table 1.

Table 2: Comparative results.

Instance no.    CM   hACS   hMACS   GA1   GA2
1                8     14      11     6     6
2               26     43      42    19    19
3                9     20      12     8     8
4               27     28      22    22    23
5               23     17      17    25    25
6               23     63      33    53    51
7               49    120     120    63    63
8              116    148     189    91    91
9              134    165     210    90    90

A graphical representation of the results is given in Figure 1. Based
on Figure 1, some conclusions follow.

Except for two instances (6 and 7), the hybrid nature-based algorithms
provide better results than the CM algorithm. The hMACS algorithm
performs better than the hACS algorithm on small instances, while hACS
is better than hMACS on larger ones. For six instances the hybrid
genetic algorithm performed better than the ant-based algorithms. The
number of generations considered for GA1 and GA2 has no significant
influence on the results.
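
These observations can be checked directly against the values of Table 2. The snippet below is only a bookkeeping aid; the dictionary layout and variable names are choices made for this illustration.

```python
# Best bandwidths from Table 2: instance -> (CM, hACS, hMACS, GA1, GA2).
results = {
    1: (8, 14, 11, 6, 6),
    2: (26, 43, 42, 19, 19),
    3: (9, 20, 12, 8, 8),
    4: (27, 28, 22, 22, 23),
    5: (23, 17, 17, 25, 25),
    6: (23, 63, 33, 53, 51),
    7: (49, 120, 120, 63, 63),
    8: (116, 148, 189, 91, 91),
    9: (134, 165, 210, 90, 90),
}

# Instances where CM is strictly better than every hybrid algorithm.
cm_best = [i for i, (cm, *hybrids) in results.items() if cm < min(hybrids)]

# Instances where the better GA run strictly beats the better ant-based run.
ga_beats_ants = [
    i for i, (_, hacs, hmacs, ga1, ga2) in results.items()
    if min(ga1, ga2) < min(hacs, hmacs)
]

print(cm_best)
print(ga_beats_ants)
```

The first list contains the two instances on which CM remains the best method, and the second contains the six instances on which the genetic algorithm outperforms both ant-based variants.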

To ensure better convergence to the solution, the ant-based hybrid
models need an "ideal" set of parameters and also a good strategy for
placing the agents in the environment.

The results for the Matrix Bandwidth Minimization Problem could be
improved by using reinforcement learning in new hybrid natural computing
techniques.

Figure 1: Comparative results.

7. Conclusions and further work 

The Matrix Bandwidth Minimization Problem (MBMP) is a classic
mathematical problem, relevant to a wide range of complex real-life
applications. The problem is NP-complete and a lot of research has been
conducted in order to find appropriate solutions.

Nowadays, bio-inspired heuristics are successfully used to solve
difficult problems. On this basis, the paper describes several soft
computing approaches for solving the MBMP. The proposed heuristics are
hybrid algorithms: genetic algorithms and ant colony algorithms. Some
standard MBMP instances are tested using the hybrid bio-inspired
algorithms and compared with results from the existing literature. The
results are encouraging.

A new theoretical reinforcement learning model for solving the
considered problem is also introduced. Computational experiments
confirmed a good performance of the proposed algorithms, emphasizing the
effectiveness of soft computing methods for solving the MBMP.

Further work will detail the proposed reinforcement learning model. More
exactly, we plan to develop an RL algorithm for training the MBMP agent
and to experimentally validate the RL model. We will also investigate
new local search procedures in order to improve the performance of the
ant system and of the genetic algorithm proposed for solving the Matrix
Bandwidth Minimization Problem.